Abstract

Internet performance is tightly related to the properties of the TCP and UDP
protocols, jointly responsible for the delivery of the great majority of
Internet traffic. It is well understood how these protocols behave under FIFO
queuing and what the effects of network congestion are. However, no
comprehensive analysis is available when flow-aware mechanisms such as per-flow
scheduling and dropping policies are deployed. Previous simulation and
experimental results leave a number of unanswered questions. In this paper,
we tackle this issue by modeling, via a set of fluid non-linear ODEs, the
instantaneous throughput and the buffer occupancy of N long-lived TCP
sources under three per-flow scheduling disciplines (Fair Queuing, Longest
Queue First, Shortest Queue First) and with longest queue drop buffer
management. We study the system evolution and analytically characterize the
stationary regime: closed-form expressions are derived for the stationary
throughput/sending rate and buffer occupancy, which give a thorough
understanding of short/long-term fairness for TCP traffic. Similarly, we provide
a characterization of the loss rate experienced by UDP flows in the presence
of TCP traffic. As a result, the analysis allows one to quantify the benefits
and drawbacks related to the deployment of flow-aware scheduling mechanisms in
different networking contexts. The model accuracy is confirmed by a set of
ns2 simulations and by the evaluation of the three scheduling disciplines in
a real implementation in the Linux kernel.



1 Introduction 

1.1 Congestion control and per-flow scheduling 

Most of the previous work on rate controlled sources, namely TCP, has con- 
sidered networks employing FIFO queuing and implementing a buffer man- 
agement scheme like drop tail or AQM (e.g. RED pO)). 
As flow-aware networking gains momentum in the future Internet arena (see
p8)), per-flow scheduling already holds a relevant position in today's
networks: it is often deployed in radio HDR/HSDPA [21], home gateways
[22, 7], border IP routers iJ^IH)', IEEE 802.11 access points [29], but also
ADSL aggregation networks. Even if TCP behavior has been mostly
investigated under FIFO queuing, in a large number of significant network
scenarios the adopted queuing scheme is not FIFO.

It is rather fundamental, then, to explore the performance of TCP in these
less studied cases and the potential benefits that may originate from the
application of per-flow scheduling in more general settings, e.g. in the
presence of rate uncontrolled sources, say UDP traffic.

In this paper we focus on some per-flow schedulers with applications in wired 
networks: 

• Fair queuing (FQ) is a well-known mechanism to impose fairness into 
a network link and it has already been proved to be feasible and scal- 
able on high rate links ( p8[|13[[T4) ). However, FQ is mostly deployed 
in access networks, because of the common belief that per-flow sched- 
ulers are not scalable, as the number of running flows grows in the 
network core. 
• Longest Queue First (LQF). In the context of switch scheduling for
core routers with virtual output queuing (VOQ), throughput maximization
has motivated the introduction of an optimal input-output port
matching algorithm, maximum weight matching (MWM, see [19]),
that, in the case of an N-input, 1-output router, reduces to selecting
the longest queues first. Due to its computational complexity, MWM has
been replaced by a number of heuristics, all equivalent to an LQF
scheduler in a multiplexer. All results on the optimality of MWM refer to
rate uncontrolled sources, leaving open questions on its performance
in more general traffic scenarios.

• Shortest Queue First (SQF). The third per-flow scheduler under study
is a more peculiar and less explored scheduling discipline that gives
priority to flows generating little queuing. Good properties of SQF
have been experimentally observed in [22, 7] in the context of home
gateways, regarding the implicit differentiation provided for UDP
traffic. In radio access networks, such as HDR/HSDPA, SQF has been shown,
via simulations, to improve TCP completion times (e.g. [21]). However,
a proper understanding of the interaction between SQF and TCP/UDP
traffic is still lacking.

Multiple objectives may be achieved through per-flow scheduling such 
as fairness, throughput maximization, implicit service differentiation. There- 
fore, explanatory models are necessary to give a comprehensive view of the 
problem under general traffic patterns. 

1.2 Previous Models 

A vast number of analytical models lie behind the progressive understanding
of the many facets of Internet congestion control and have successfully
contributed to the solution of a number of networking problems. To cite a few:
fair rate allocation [11, 17], TCP throughput evaluation (see [23:20 'IT'S))
and maximization (Split TCP [6], multi-path TCP [9]), buffer sizing [25],
etc. Only recently have some works started modeling TCP under per-flow
scheduling in the context of switch scheduling ([8], [26]), once the
necessary distinction is made between per-flow and switch scheduling, the
latter being aware of input/output ports and not of single users' flows. More
precisely, the authors focus on the case of one flow per port, where the two
problems essentially coincide. In [8] a discrete-time representation of the
interaction between rate controlled sources and LQF scheduling is given, under
the assumption of a per-flow RED-like buffer management policy that distributes
early losses among flows proportionally to their input rate (as in [16, 24]).
Packets are supposed to be chopped into fixed-size cells, as commonly done in
high speed switching architectures.

In [26] a similar discrete-time representation of scheduling dynamics is
adopted for LQF and FQ, with the substantial difference of separate queues of
fixed size, instead of virtual queues sharing a common memory. Undesirable
effects of flow stall are observed in [26] and not in [8], due to the different
buffer management policy. Indeed, the assumption of separate physical
queues is not of minor importance, since it may lead to flow stall as a
consequence of tail drop on each separate queue. Such unfair and unwanted
effects can be avoided using virtual queues with a shared memory jointly
with Longest Queue Drop (LQD) buffer management (see [28]). This
phenomenon is known and was first observed in [28]. In this
way, in case of congestion, sources with little queuing are not penalized by
packet drops, which are allocated, instead, to more greedy flows. Both works
([8], [26]) are centered on TCP modeling, while UDP is considered ([26]) only as
background traffic for TCP, without any evaluation of its performance. Finally,
none of the aforementioned models has been solved analytically: they have only
been solved numerically and compared with network simulations.

1.3 Contribution 

The paper tackles the issue of modeling the three above-mentioned flow-aware
scheduling disciplines in the presence of TCP/UDP traffic in a comprehensive
analytical framework based on a fluid representation of system dynamics.
We first collect some experimental results in Sec. 2. In order to
address the questions left open, we develop in Sec. 3 a fluid deterministic
model describing, through ordinary differential equations, both the behavior
of TCP sources and the occupancy of virtual queues over time. Model accuracy
is assessed in Sec. 4 via comparison against ns2 simulations. In the presence
of N = 2 TCP flows, the system of ODEs presented in Sec. 3 is analytically
solved in steady state, and closed-form expressions for mean sending rates
and throughputs are provided in Sec. 5 for the three scheduling disciplines
under study (SQF, LQF and FQ). The model is then generalized to the case
of N > 2 flows and to the presence of UDP traffic in Sec. 6. Interesting
results on the UDP loss rate are derived in a mixed TCP/UDP scenario. A
numerical evaluation of the analytical formulas is carried out in Sec. 7 in
comparison with packet-level simulations. Finally, Sec. 8 summarizes the
paper's contribution and sheds light on potential applications of per-flow
scheduling, particularly SQF.



2 Experimental remarks 

In this section we consider a simple testbed as a starting point of our analysis.
An implementation of FQ is available in Linux and, in addition, we have
developed the two missing modules needed in our context, LQF and SQF. All
three per-flow schedulers have similar implementations: a common memory
is shared by virtual queues, one per flow. Packets belong to the same flow
if they share the same 5-tuple (IP source and destination addresses, port
numbers, protocol), and in case of memory saturation the flow with the longest
queue gets a packet drop (LQD). Our small testbed is depicted in Fig. 1, where
sender and receiver employ the Linux implementation of TCP Reno. Different
round trip times (RTTs) on each flow are obtained by adding emulated delays
through netem and iptables ([2]). Three TCP flows with RTTs of 10, 100 and
200 ms are run in parallel in the testbed, and their long term throughput is
measured at the receiver. Each test lasts 5 minutes and the throughput is
averaged over 10 runs. The results are reported in Fig. 1, and they can also be
read through the Jain fairness index J ([10]) in Tab. 1. As for the long term
throughput, we observe that FQ is fair and the throughput is not affected by
RTTs. On
Scheduler    ST       LT
FQ           0.999    0.999
LQF          0.734    0.736
SQF          0.55     0.735

Table 1: Jain index of fairness on short (ST, 0.5s) and long term (LT, 5min).

the contrary, LQF and SQF suffer from an RTT bias: LQF favors flows with
small RTT, while SQF favors flows with large RTT. Moreover, Tab. 1 shows
that, while FQ and LQF show no difference between long and short term,
SQF is much more unfair at short time scales, as J drops to 0.55 in the
short term while being 0.735 in the long term (J = 1/N means that one flow
out of N gets all the resource over the time window, J = 1 is for perfect sharing).
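The two limiting values quoted for J can be checked with a short sketch. It assumes the standard definition of the Jain index, J = (sum of x_k)^2 / (N * sum of x_k^2); the function name and the sample throughput vectors are illustrative and are not taken from the testbed.

```python
def jain_index(throughputs):
    """Jain fairness index J = (sum x)^2 / (N * sum x^2)."""
    n = len(throughputs)
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero throughput")
    return total * total / (n * squares)


if __name__ == "__main__":
    # Perfect sharing among three flows gives J = 1.
    print(jain_index([1.0, 1.0, 1.0]))
    # One flow out of three taking all the resource gives J = 1/3.
    print(jain_index([3.0, 0.0, 0.0]))
```

Computing J over short windows (0.5 s) and over the whole run (5 min) yields the ST and LT columns of Tab. 1, respectively.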
Another interesting metric is the ratio between the received rate (throughput) 




Figure 1: On the left: the experimental scenario. On the right: throughput's allo- 
cations. 

and the sending rate, Rrcv/Rsnd, which is equal to 99% for FQ and LQF
and 95% for SQF. In fact, SQF induces more losses than the other two
schedulers. Link utilization is not affected by the scheduler employed and is
always about 100%. All these phenomena are difficult to study and explain
properly through a testbed and are better analyzed through an
explanatory model. The analysis should answer a number of questions:

i) what is the instantaneous sending rate/throughput of TCP under these 
three schedulers? 

ii) what is the long term throughput? 

iii) in a general network traffic scenario, including also UDP flows, what 
is the performance of the whole system? 

3 Fluid Model 

The network scenario under study is a single bottleneck link of capacity C
shared by a finite number N of long-lived traffic sources, of rate controlled
(TCP) and rate uncontrolled (UDP) kind. We first consider the case of rate
controlled TCP sources only. The mixed TCP/UDP scenario is studied in
Sec. 6.2.
3.1 Assumptions

Each traffic source is modeled at the flow timescale through a deterministic
fluid model where the discrete packet representation is replaced by a
continuous one, both in space and in time. Previous examples of fluid models of
TCP/UDP traffic can be found in [20], [3], [5]. Let us summarize the assumptions
behind the model and give the notation (reported in Tab. 2).

• A rate controlled source is modeled by a TCP Reno flow in the congestion
avoidance phase, driven by the AIMD (Additive Increase Multiplicative
Decrease) rule, thus neglecting the initial slow start phase and fast
recovery.

• The round-trip delay $R_k(t)$ of TCP flow $k$, with $k = 1, \dots, N$, is
defined as the sum of two terms:

$$R_k(t) = PD_k + \frac{Q_k(t)}{C},$$

where $PD_k$ is the constant round-trip propagation delay (which includes
the transmission delay) and $Q_k(t)/C$ is the queuing delay, with
$Q_k(t)$ denoting the instantaneous virtual queue associated to flow $k$.
Remark that in the presence of a 'per-flow' scheduler each flow $k$ is mapped
into a virtual queue $Q_k$. The total queue occupancy, denoted as
$Q(t) = \sum_k Q_k(t)$, is limited to $B$.

• Sending rate. The instantaneous sending rate of flow $k$, $k = 1, 2, \dots, N$,
denoted by $A_k(t)$, is assumed proportional to the congestion window
(in virtue of Little's law),

$$A_k(t) = W_k(t)/R_k(t).$$

• Buffer management mechanism. Under the assumption of longest
queue drop (LQD) as buffer management mechanism, whenever the
total queue saturates, the TCP flow with the longest virtual queue is
affected by packet losses and consequent window halvings.
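As a minimal numerical sketch of the two definitions above, the snippet below evaluates the round-trip delay as propagation delay plus queuing delay and the sending rate as window over round-trip delay. The function names and the sample values (packets, packets per second, seconds) are illustrative assumptions, not quantities from the paper.

```python
def round_trip_delay(prop_delay, virtual_queue, capacity):
    """R_k(t) = PD_k + Q_k(t) / C (seconds, packets, packets per second)."""
    return prop_delay + virtual_queue / capacity


def sending_rate(cwnd, prop_delay, virtual_queue, capacity):
    """A_k(t) = W_k(t) / R_k(t), in packets per second."""
    return cwnd / round_trip_delay(prop_delay, virtual_queue, capacity)


if __name__ == "__main__":
    # Hypothetical flow: 10 ms propagation delay, 50 packets queued,
    # link of 1000 packets/s, congestion window of 30 packets.
    rtt = round_trip_delay(0.01, 50.0, 1000.0)
    rate = sending_rate(30.0, 0.01, 50.0, 1000.0)
    print(rtt, rate)
```

With these sample values the queuing delay (50 ms) dominates the propagation delay, which is the regime where the constant-RTT simplification used later in the analysis is least accurate.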

3.2 Source equations 



According to the usual fluid deterministic representation ([3], [5], [6]), the
sending rate increases linearly as $1/R_k^2(t)$ in the absence of packet losses.
Such a representation, usually employed for FIFO schedulers, has to be modified
in the presence of per-flow schedulers, when flows are not all simultaneously in
service. More precisely, it is reasonable to assume that flows do not increase
their sending rate when not in service. It follows that, in the absence of
packet losses, the increase of the sending rate is given by

$$\frac{1}{R_k^2(t)}\left(\mathbb{1}_{\{Q(t)=0\}} + \frac{D_k(t)}{C}\,\mathbb{1}_{\{Q(t)>0\}}\right).$$

The last expression accounts for linear increase whenever the total queue
is empty and also for the reduction of the increase factor when multiple
flows are simultaneously in service, proportional to their own departure rate,
$D_k(t)$.
Symbol     Meaning
C          Link capacity
N          Number of TCP flows
R_k(t)     Round trip delay of flow k, k = 1, ..., N
Q_k(t)     Virtual queue of flow k, k = 1, ..., N
Q(t)       Total queue of finite size B
A_k(t)     Sending rate of flow k, k = 1, ..., N
D_k(t)     Departure rate (or throughput) of flow k, k = 1, ..., N
L_k(t)     Loss rate of flow k, k = 1, ..., N
Q^MAX      Set of longest queues (Q^MIN similarly defined)
A?         Set of bottlenecked flows, i.e. A_k(t) > C/N, k = 1, ..., N

Table 2: Notation



Whenever a congestion event takes place (i.e. when the total queue $Q$ reaches
saturation, $Q(t) = B$) and the virtual queue $Q_k(t)$ is the largest one, flow
$k$ starts losing packets in the queue at a rate proportional to the exceeding
input rate at the queue, that is $(A_k(t) - C)^+$. In addition, it halves
its sending rate $A_k(t)$ proportionally to $(A_k(t) - C)^+$ (we use the
convention $(\cdot)^+ = \max(\cdot, 0)$). Therefore, the instantaneous sending
rate of flow $k$, $k = 1, \dots, N$, satisfies the following ODE (ordinary
differential equation):

$$\frac{dA_k(t)}{dt} = \frac{1}{R_k^2(t)}\left(\mathbb{1}_{\{Q(t)=0\}} + \frac{D_k(t)}{C}\,\mathbb{1}_{\{Q(t)>0\}}\right) - \frac{1}{2}A_k(t)\,L_k(t - R_k(t)) \tag{1}$$

where $D_k(t)$ denotes the departure rate for virtual queue $k$ at time $t$ (the
additive increase takes place only when the corresponding virtual queue is in
service) and $L_k(t)$ denotes the loss rate of flow $k$, defined by

$$L_k(t) = \begin{cases} (A(t) - C)^+\,\mathbb{1}_{\{Q(t)=B\}}\,\mathbb{1}_{\{k = \arg\max_j Q_j(t)\}} & \text{if } Q_j(t) < Q_k(t),\ \forall j \neq k \\ (A_k(t) - D_k(t))^+\,\mathbb{1}_{\{Q(t)=B\}}\,\mathbb{1}_{\{k = \arg\max_j Q_j(t)\}} & \text{otherwise.} \end{cases} \tag{2}$$

The loss rate is proportional to the fraction of the total arrival rate
$A(t) = \sum_k A_k(t)$ that exceeds link capacity when there exists only one
longest queue. In the presence of multiple longest queues, the allocation of
losses among flows is made according to the difference between the input and
the output rate of each flow. Of course, the reaction of TCP to the losses is
delayed according to the round trip time $R_k(t)$.

Observation 3.1. Note that the assumption of zero rate increase for flows not
in service is a consequence of the fact that the acknowledgment rate
is null in such a phase. On the contrary, for FIFO schedulers, all flows are
likely to lose packets and adjust their sending rate simultaneously, so one
can reasonably argue that the acknowledgment rate is never zero.

3.3 Queue disciplines 

The instantaneous occupation of virtual queue $k$, $k = 1, \dots, N$, obeys the
fluid ODE:

$$\frac{dQ_k(t)}{dt} = A_k(t) - D_k(t) - L_k(t). \tag{3}$$
In this paper we consider three different work-conserving service disciplines
(for which $\sum_k D_k(t) = C\,\mathbb{1}_{\{Q(t)>0\}}$): FQ (Fair Queuing), LQF
(Longest Queue First), SQF (Shortest Queue First). The departure rate $D_k(t)$
varies according to the chosen service discipline:

• FQ: 

\Ak{t) xfAu{t)iAf 



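• LQF:

$$D_k(t) = C\,\frac{A_k(t)}{\sum_{j \in Q^{MAX}} A_j(t)}\,\mathbb{1}_{\{Q_k(t) = \max_j Q_j(t)\}}$$

• SQF:

$$D_k(t) = C\,\frac{A_k(t)}{\sum_{j \in Q^{MIN}} A_j(t)}\,\mathbb{1}_{\{Q_k(t) = \min_j Q_j(t)\}}$$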

The total loss rate is denoted by $L(t) = \sum_k L_k(t)$. The instantaneous
occupation of the total queue $Q(t)$ is, hence, given by

$$\frac{dQ(t)}{dt} = A(t) - C\,\mathbb{1}_{\{Q(t)>0\}} - L(t). \tag{4}$$
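The LQF and SQF service rules can be illustrated with a small sketch. It rests on one assumption that should be read as such: when several virtual queues tie for longest (LQF) or shortest (SQF), the capacity is split among them in proportion to their arrival rates. The sketch covers only the backlogged case Q(t) > 0, and the function name and sample values are hypothetical.

```python
def departure_rates(discipline, arrivals, queues, capacity, tol=1e-9):
    """Per-flow departure rates D_k for a backlogged LQF or SQF link.

    Assumption: queues tied for longest (LQF) or shortest (SQF) share
    the capacity proportionally to their arrival rates.
    """
    if sum(queues) <= tol:
        raise ValueError("sketch covers the backlogged case Q(t) > 0 only")
    if discipline == "LQF":
        target = max(queues)
    elif discipline == "SQF":
        target = min(queues)
    else:
        raise ValueError("discipline must be 'LQF' or 'SQF'")

    served = [k for k, q in enumerate(queues) if abs(q - target) <= tol]
    served_arrivals = sum(arrivals[k] for k in served)

    rates = [0.0] * len(queues)
    for k in served:
        if served_arrivals > 0:
            rates[k] = capacity * arrivals[k] / served_arrivals
        else:
            # No arrivals among the served flows: split evenly.
            rates[k] = capacity / len(served)
    return rates


if __name__ == "__main__":
    print(departure_rates("SQF", [2.0, 4.0], [3.0, 1.0], 10.0))
    print(departure_rates("LQF", [2.0, 4.0], [3.0, 1.0], 10.0))
    print(departure_rates("LQF", [2.0, 6.0], [2.0, 2.0], 8.0))
```

In every case the returned rates sum to the link capacity, which is the work-conserving property stated above.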



4 Model accuracy 

Before solving the model presented in Sec. 3, we present some packet-level
simulations using ns2 ([1]) to show the accuracy of the model. We have
implemented LQF and SQF in addition to FQ, which is already available
in ns2; all implemented schedulers use shared memory and LQD buffer
management. Network simulations allow one to monitor some variables more
precisely than in a testbed, such as the TCP congestion window (cwnd) and the
time evolution of virtual queues. The sending rate evolution is then evaluated
as the ratio cwnd/RTT, and the queue evolution is measured at every packet
arrival, departure and drop. This section includes some samples of the large
number of simulations run to assess model accuracy. We present a simple
scenario that counts two TCP flows with RTTs of 2 ms and 6 ms sharing the same
bottleneck of capacity C = 10 Mbps, with a line card using a memory of size
150 kB. ns2 simulates IP packets of fixed MTU size equal to 1500 B. In Figs. 2
and 3 we compare the system evolution predicted by (1)-(4) against the queue
and rate evolution estimated in ns2 as the ratio cwnd/RTT (congestion window
over round trip time, variable in time). Apart from the intrinsic and known
limitations of fluid models, which cannot capture packet-level burstiness
(visible in the sudden changes of queue occupancy), the match between
packet-level simulations and model prediction is remarkable. The short
timescale oscillations observed in SQF are explained in Sec. 5.



5 Analytical results 

Let us focus on the system of ODEs (1)-(4) under the simplifying assumptions
of instantaneous congestion detection (the same assumption as in [5], [6]) and
of constant round trip delay,

$$R_k(t) = R_k, \quad k = 1, \dots, N. \tag{5}$$

The last assumption is reasonable when the propagation delay term is
predominant. In addition, the numerical solution of (1)-(4) and ns2 simulations
confirm that the system behavior in the presence of variable $R_k(t)$ and
delayed congestion detection is not significantly different. Eq. (1) becomes

$$\frac{dA_k(t)}{dt} = \frac{1}{R_k^2}\left(\mathbb{1}_{\{Q(t)=0\}} + \frac{D_k(t)}{C}\,\mathbb{1}_{\{Q(t)>0\}}\right) - \frac{1}{2}A_k(t)\,L_k(t). \tag{6}$$

Figure 2: Time evolution of rates (top) and queues (bottom) under SQF: the model
on the left, ns2 on the right.

Figure 3: Time evolution of rates and queues under FQ (top) and LQF (bottom):
the model on the left, ns2 on the right.
In the following, we analytically characterize the solution of the system of
ODEs (4)-(6) in the presence of N = 2 flows and for the three scheduling
disciplines under study. We first focus on the SQF scheduling discipline, before
studying the more intuitive behavior of the system under LQF and FQ in
Sec. 5.2 and Sec. 5.3.



5.1 Shortest Queue First (SQF) scheduling discipline 

For ease of exposition, denote $\alpha = 1/R_1^2$ and $\beta = 1/R_2^2$, and take
$\alpha > \beta$ (the same arguments hold in the dual case $\alpha < \beta$,
replacing $\alpha$ with $\beta$ and vice versa).

Lemma 5.1. The dynamical system described by under SQF schedul- 

ing discipline admits as unique stationary solution the limit-cycle composed 
by: 

• phase ; Vt € phase (when the origin coincides with the 

beginning of the phase) 

Mt) = 2^2Cf{2C,0,t), Mt)^(3t, 

Q2{t) f + ^ (/^" - ^)^"' ^1 W = ^ - Q2(t), 

where /(•) denotes the limiting composition of f{-) functions, /(•) == 
/ o / o • • • o / and 

/(a,6,i)^e-5(2fc-2C+;3t)/ 

+ ae^V2-. (Erf ^ ' ^ - Erf " ^ 



213 J V VW 

• phase : Vi € phase A^^ (when the origin coincides with the 
beginning of the phase) 

Ii (t) ^at, M (t) = 2y/^2Cg{2C, 0,t), 

Qi(i) = I + ^ {au - C)du, Q2{t) = B~ Qi{t), 

where g{-) is defined by symmetry w.r.t. /(•) and 
g(a,&,0 = e"4(2''-2C+at)/ 

2^+ae^^V2^[ Erf — - Erf ' 



2a J \ y/2a 

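The duration of phase $\mathcal{A}_1$ (flow 1 in service) is $2C/\alpha$, while
that of phase $\mathcal{A}_2$ (flow 2 in service) is $2C/\beta$. The period of
the limit cycle is, therefore, equal to

$$T = 2C\left(\frac{1}{\alpha} + \frac{1}{\beta}\right).$$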
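The phase durations in the lemma can be checked numerically. The sketch below assumes slopes of the form 1/R^2 for the linear rate increase and uses the in-service queue expression B/2 + integral of (slope*u - C): this integral returns to B/2 exactly at t = 2C/slope. The link rate, RTTs and buffer size are hypothetical, and the sketch ignores the constraint that a queue cannot become negative.

```python
def sqf_cycle(capacity, rtt1, rtt2):
    """Phase durations and period of the SQF limit cycle for two flows.

    Assumption: the increase slopes are alpha = 1/rtt1^2, beta = 1/rtt2^2.
    """
    alpha = 1.0 / rtt1 ** 2
    beta = 1.0 / rtt2 ** 2
    phase1 = 2.0 * capacity / alpha  # flow 1 in service
    phase2 = 2.0 * capacity / beta   # flow 2 in service
    return phase1, phase2, phase1 + phase2


def in_service_queue(t, slope, capacity, buffer_size):
    """B/2 + integral_0^t (slope*u - C) du for the flow in service."""
    return buffer_size / 2.0 + slope * t * t / 2.0 - capacity * t


if __name__ == "__main__":
    # Hypothetical link: 1000 packets/s, RTTs of 10 ms and 20 ms, B = 100.
    p1, p2, period = sqf_cycle(1000.0, 0.01, 0.02)
    print(p1, p2, period)
    # The served queue is back at B/2 at the end of its phase.
    print(in_service_queue(p1, 1.0 / 0.01 ** 2, 1000.0, 100.0))
```

The flow with the larger RTT has the smaller slope and therefore the longer service phase, which is consistent with the RTT bias of SQF observed in the testbed.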

Proof. The proof is structured into three steps: 1) we study the transient
regime and analyze how the system reaches the first saturation point, i.e. the
time instant denoted by $t_s$ at which the total queue occupancy $Q$ attains
saturation ($Q = B$),

$$t_s = \inf\{t > 0 : Q(t) = B\}.$$

2) We show that once this state is reached, the system does not leave the
saturation regime, where the buffer remains full. 3) Under the condition
$Q(t) = B$, $\forall t > t_s$, we characterize the cyclic evolution of virtual
queues and rates, and eventually show the existence of a unique limit-cycle in
steady state, whose characterization is provided analytically.

1) With no loss of generality, assume the system starts with $Q$ empty at
$t = 0$, i.e. $Q_1(0) = Q_2(0) = 0$, and initial rates $A_1(0) = A_1^0$,
$A_2(0) = A_2^0$. At $t = 0$, the TCP rates $A_1$, $A_2$ start increasing
according to (6) and the virtual queues are simultaneously served at rates
$A_1(t)$, $A_2(t)$ as long as $A(t) < C$. Hence, the queue $Q$ remains empty
until the input rate to the buffer,

$$A(t) = A_1(t) + A_2(t) = A_1^0 + A_2^0 + (\alpha + \beta)t,$$

exceeds the link capacity $C$, in $t_0 = (C - A_1^0 - A_2^0)/(\alpha + \beta)$.
From this point on, the total queue starts filling and the TCP rates keep
increasing in time (with $A(t) > C$, $\forall t > t_0$). The system faces no
packet losses until the first saturation point. Departing from a non-empty queue
clearly accelerates the attainment of buffer saturation, but it is always true
that at $t_s$ the buffer is full ($Q(t_s) = B$) and $A_1(t_s) + A_2(t_s) > C$.
In general, one can state that, independently of the state of the system at
$t = 0$, the rate increase in the absence of packet losses guarantees that the
system reaches total queue saturation in a finite time. When this happens, at
$t = t_s$, the total input rate is greater than the output link capacity, i.e.
$A(t_s) > C$, which is equivalent to saying that $dQ(t)/dt\,|_{t=t_s} > 0$.

2) Once saturation is reached, the auxiliary result in Appendix A proves that
the system remains in the saturation regime, that is $Q(t) = B$,
$\forall t > t_s$.

Observation 5.2. One can argue that in practice the condition of full buffer
is only verified within limited time intervals, and that the queue occupancy
goes periodically under $B$, when the buffer is emptying. Actually, ns2
simulations and experimental tests confirm that the queue occupancy can
instantaneously be smaller than $B$, but overall the average queue occupancy
is approximately equal to $B$. Indeed, the difference w.r.t. a Drop Tail/FIFO
queue is that SQF with the LQD mechanism eliminates the loss synchronization
induced by drop tail under the FIFO scheduling discipline, so that in our
setting, when a flow experiences packet losses, the other one still increases
and feeds the buffer.

3) Suppose $Q_1$ is the longest queue at $t_s$ (the dual case is completely
symmetric). The rate $A_1$ sees a multiplicative decrease according to (6), as
$Q(t_s) = B$ and $Q_1$ is the longest virtual queue. By solving (6), the
resulting rate decay of $A_1$ is, $\forall t > t_s$,

Aiit) = 2^Aiitg)fiAiitg),A2{tg),t - t,), (9) 

where $f(\cdot)$ is defined in (7). At the same time, $A_2$ keeps increasing
linearly. In terms of queue occupation, since we proved that $Q(t) = B$ from
$t_s$ on, it means that the decrease due to packet drops in $Q_1$ is compensated
by the increase of $Q_2$. In fact, no matter the values of $Q_1$ and $Q_2$ in
$t_s$, the rate at which $Q_1$ decreases is a function of the rate at which
$Q_2$ increases. Indeed, $\forall t > t_s$,

$$Q_1(t) = Q_1(t_s) + \int_{t_s}^{t} A_1(v)\,dv - \int_{t_s}^{t}\left(A_1(v) + A_2(v) - C\right)dv = B - Q_2(t_s) - \int_{t_s}^{t}\left(A_2(v) - C\right)dv = B - Q_2(t).$$



It follows that the two queues become equal when $Q_1 = Q_2 = B/2$. We can
denote by $t_1 = t_s + \tau_1$ the first queue meeting point,

$$t_1 = \inf\left\{t > t_s : \int_{t_s}^{t}\left(A_2(v) - C\right)dv = \frac{B}{2} - Q_2(t_s)\right\}. \tag{10}$$

The state of the system in $t_1$ is

$$A_1(t_1) = \epsilon_1, \quad A_2(t_1) = \beta t_1, \quad Q_1(t_1) = Q_2(t_1) = \frac{B}{2}, \tag{11}$$

where we defined $\epsilon_1 > 0$ as the value $A_1(t_1)$ given by (9), which we
recall is a function of $A_1(t_s)$. From $t_1$ on, the two flows, and thus the
two virtual queues, invert their roles: $Q_1$ enters service, whereas $Q_2$
suffers from packet losses and, consequently, $A_2$ observes a multiplicative
rate decrease, while $A_1$ increases linearly. Since $L(t) > 0$,
$\forall t > t_1$,

$$Q_1(t) = Q_1(t_1) + \int_{t_1}^{t}\left(A_1(v) - C\right)dv = \frac{B}{2} + \int_{t_1}^{t}\left(\epsilon_1 + \alpha(v - t_1) - C\right)dv = B - Q_2(t),$$

and $Q_1(t_1') = Q_2(t_1') = B/2$, if we denote by $t_1' = t_1 + \tau_1'$ the next
queue meeting point,

$$t_1' = \inf\left\{t > t_1 : \frac{B}{2} + \int_{t_1}^{t}\left(\epsilon_1 + \alpha(v - t_1) - C\right)dv = \frac{B}{2}\right\}. \tag{12}$$

The state of the system in $t_1'$ is

$$A_1(t_1') = \epsilon_1 + \alpha\tau_1', \quad A_2(t_1') = \gamma_1, \quad Q_1(t_1') = Q_2(t_1') = \frac{B}{2},$$

where 71 = ^2(ti) = 27^^2(^1)3(^2(^1), ^1(^1 ), t'l-ti) (71 is a function 
of A2{ts)) and g{-) is defined in ([s]). 

Iteratively, one can construct a cycle for the virtual queue $Q_1$ ($Q_2$)
composed of two phases:

• phase $\mathcal{A}_2$: when $Q_1$ ($Q_2$) goes from $B/2$ to $B/2$ remaining
smaller (greater) than $B/2$, at a rate $C - A_2$ (respectively $A_2 - C$);

• phase $\mathcal{A}_1$: when $Q_1$ ($Q_2$) goes from $B/2$ to $B/2$ remaining
greater (smaller) than $B/2$, at a rate $C - A_1$ (respectively $A_1 - C$).
Figure 4: Cyclic time evolution of TCP rates under SQF scheduling.

The duration of these phases depends on the values of $A_2$ and $A_1$
respectively at the beginning of phase $\mathcal{A}_2$ and phase
$\mathcal{A}_1$. As in Fig. 4, denote by $\epsilon_i$ and $\gamma_i$ the initial
values of $A_1$ and $A_2$ at the beginning of the $i$-th phase $\mathcal{A}_1$
and $\mathcal{A}_2$ respectively. Thanks to auxiliary result 2 in Appendix B,
we prove that in steady state such initial values are zero, which concludes the
proof. □

Let us now provide analytical expressions for the mean stationary values of
sending rate and throughput. As a consequence, one can also compute the
mean values of the stationary queues, $Q_1$, $Q_2$ (one smaller, the other
greater than $B/2$).

Corollary 5.3. The mean sending rates in steady state are given by: 
Ai « I ;C + — — -log I 1 + ^^Ce^^Sr/ 





\a + l3 2C(a + /3) \ ^ 
The mean throughput values in steady state are given by: 

a + p a + p 

The mean virtual queues in steady state are given by: 

B C^{a^ B C\a-p) 

Proof. The proof is reported in appendix [C] □ 
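5.2 Longest Queue First (LQF) scheduling discipline

Lemma 5.4. The dynamical system described by (5)-(6) under the LQF scheduling
discipline admits a unique stationary solution:

$$A_1 = \frac{\alpha}{\alpha+\beta}\,\frac{C}{2}\left(1 + \sqrt{1 + \frac{8(\alpha+\beta)}{C^2}}\right), \quad A_2 = \frac{\beta}{\alpha}\,A_1,$$

$$X_1 = \frac{\alpha}{\alpha+\beta}\,C, \quad X_2 = \frac{\beta}{\alpha+\beta}\,C, \quad Q_1 = Q_2 = \frac{B}{2}.$$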

Proof. As in the case of SQF, no matter the initial condition, the system
reaches the first saturation point, $t_s = \inf\{t > 0 : Q(t) = B\}$, with
$A(t_s) > C$, and the buffer enters the saturation regime, that is $Q(t) = B$,
$\forall t > t_s$. As an example, we can consider the case of initial condition
$A_1^0$, $A_2^0$, $Q_1 = Q_2 = 0$. At $t = 0$, $A_1$, $A_2$ start increasing as
for SQF, while $Q$ remains empty until the input rate to the buffer,

$$A(t) = A_1^0 + A_2^0 + (\alpha + \beta)t,$$

exceeds the link capacity $C$, in $t_0 = (C - A_1^0 - A_2^0)/(\alpha + \beta)$.
From this point on, the total queue starts filling and the TCP rates keep
increasing in time according to (6), with the only difference, w.r.t. the case
of SQF, that $Q_1 > Q_2$ is the longest queue, served until buffer saturation,
and $Q_2$ only gets the remaining service capacity, when there is any. The
system faces no packet losses and $Q_1$ remains in service until $t_s$.

At $t_s$, suppose with no loss of generality (the other case is symmetrical)
that $Q_1$ is the longest virtual queue. The rate of TCP flow 1, $A_1$, sees a
multiplicative decrease according to (6). By solving (6), the resulting rate
decay of $A_1$ is

Vt > ts 

A^{t) = 2^pA^(tg)j{A^(tg),A2{tg),t - tg), (13) 

where $f(\cdot)$ is defined in (7). At the same time, $A_2$ increases linearly.
Once proved that $Q(t) = B$, $\forall t > t_s$ (see Appendix A), no matter the
values of $Q_1$ and $Q_2$ in $t_s$, it holds that $Q_1(t) = B - Q_2(t)$. Thus,
the two queues meet when $Q_1 = Q_2 = B/2$, at what we denote by
$t_1 = t_s + \tau_1$, the first queue meeting point. Once the queues reach the
state $Q_1 = Q_2 = B/2$, they cannot exit this state. Indeed, if we rewrite (3),
it gives:

$$\frac{dQ_k(t)}{dt} = A_k(t) - C\,\frac{A_k(t)}{A(t)} - \frac{A_k(t)}{A(t)}\left(A(t) - C\right) = 0,$$

with $k = 1, 2$, independently of the rate values. It implies that
$Q_1 = Q_2 = B/2$. This can be explained by the following consideration: in the
presence of two longest queues, $Q_1(t) = Q_2(t) = B/2$, both queues are served
proportionally to the input rates $A_1(t)$ and $A_2(t)$, and lose packets in the
same proportion, thus preventing each queue from becoming greater (smaller) than
the other.

As for the rate evolution, we now have two ODEs no longer coupled with the
queue evolution,

$$\frac{dA_k(t)}{dt} = \frac{1}{R_k^2}\,\frac{A_k(t)}{A(t)} - \frac{1}{2}\,A_k(t)\,\frac{A_k(t)}{A(t)}\left(A(t) - C\right), \quad k = 1, 2,$$

where we recall that we proved in Appendix A that $A(t) > C$, $\forall t > t_s$.
The system of ODEs can be easily solved and it admits one stationary solution:

$$A_1 = \frac{\alpha}{\alpha+\beta}\,\frac{C}{2}\left(1 + \sqrt{1 + \frac{8(\alpha+\beta)}{C^2}}\right), \quad A_2 = \frac{\beta}{\alpha}\,A_1. \tag{16}$$

In order to compute the mean throughput values in steady state, it suffices to
observe that $X_k = C\,(A_k/A)$, $k = 1, 2$, which concludes the proof. Note
that in practice $8(\alpha + \beta) \ll C^2$, which implies $A_k \approx X_k$,
$k = 1, 2$. □
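The LQF stationary point can be verified numerically under stated assumptions. The sketch takes the decoupled rate ODE in the form dA_k/dt = slope_k * A_k/A - (1/2) * A_k * (A_k/A) * (A - C), which is our reading of the equations in this proof, and checks that the closed-form rates make its right-hand side vanish. The function names and parameter values are hypothetical.

```python
import math


def lqf_stationary(capacity, alpha, beta):
    """Stationary sending rates and throughputs under LQF, two flows.

    Assumption: the total rate A solves A * (A - C) = 2 * (alpha + beta).
    """
    total = 0.5 * capacity * (
        1.0 + math.sqrt(1.0 + 8.0 * (alpha + beta) / capacity ** 2)
    )
    a1 = alpha / (alpha + beta) * total
    a2 = beta / (alpha + beta) * total
    x1 = capacity * a1 / total
    x2 = capacity * a2 / total
    return (a1, a2), (x1, x2)


def lqf_rate_derivative(a_k, a_total, slope, capacity):
    """Right-hand side of the assumed decoupled rate ODE for flow k."""
    share = a_k / a_total
    return slope * share - 0.5 * a_k * share * (a_total - capacity)


if __name__ == "__main__":
    # Hypothetical values: C = 1000 packets/s, RTTs of 10 ms and 20 ms.
    alpha, beta = 1.0 / 0.01 ** 2, 1.0 / 0.02 ** 2
    (a1, a2), (x1, x2) = lqf_stationary(1000.0, alpha, beta)
    print(a1, a2, x1, x2)
    print(lqf_rate_derivative(a1, a1 + a2, alpha, 1000.0))
    print(lqf_rate_derivative(a2, a1 + a2, beta, 1000.0))
```

The throughputs split the capacity in the ratio alpha : beta, so the flow with the smaller RTT gets the larger share, in line with the RTT bias of LQF noted in the experimental section.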

5.3 Fair Queuing (FQ) scheduling discipline 

Lemma 5.5. The dynamical system described by Q-([6| under FQ schedul- 
ing discipline admits a unique stationary solution. 




Proof. As for SQF or LQF, no matter the initial condition, the system reaches
the first saturation point, $t_s = \inf\{t > 0 : Q(t) = B\}$, with
$A(t_s) > C$, and the buffer enters the saturation regime, that is $Q(t) = B$,
$\forall t > t_s$. The evolution of the system until $t_s$ differs from that of
SQF or LQF, as the two flows are allocated the fair rate until the virtual
queues start filling (when $A(t)$ exceeds $C$), and then they are served at
capacity $C/2$ independently of $\alpha$ and $\beta$. At $t_s$, the system
suffers from packet losses at rate $L(t) = A(t) - C$ and the virtual queues
$Q_1$, $Q_2$ evolve towards the state $Q_1(t) = Q_2(t) = B/2$,

$$Q_1(t) = Q_1(t_s) + \int_{t_s}^{t}\left(A_1(v) - \frac{C}{2}\right)dv - \int_{t_s}^{t}\left(A_1(v) + A_2(v) - C\right)dv = B - Q_2(t_s) - \int_{t_s}^{t}\left(A_2(v) - \frac{C}{2}\right)dv = B - Q_2(t).$$

Once the queues reach the state $Q_k = B/2$, $A_k > C/2$, $k = 1, 2$ (otherwise
one of the virtual queues would be empty) and the queues cannot exit this
state. Indeed, if we rewrite the ODE (3), it gives:

$$\frac{dQ_k(t)}{dt} = A_k(t) - \frac{C}{2} - \left(A_k(t) - \frac{C}{2}\right) = 0,$$

with $k = 1, 2$, independently of the rate values. It implies that
$Q_1 = Q_2 = B/2$.
As for the rate evolution, we now have two ODEs no longer coupled with
the queue evolution,

$$\frac{dA_k(t)}{dt} = \frac{1}{2R_k^2} - \frac{A_k(t)}{2}\left(A_k(t) - \frac{C}{2}\right), \quad k = 1, 2.$$
The system of ODEs can be easily solved and it admits one stationary solu- 
tion: 



(20) 



In order to compute the mean throughput values in steady state, it suffices to
observe that virtual queues are not empty and $A_k > C/2$, $k = 1, 2$, therefore
$X_k = C/2$, $k = 1, 2$. In addition, note that, if $4\alpha \ll C^2$, a
condition usually verified in practice, $A_k \approx X_k$, $k = 1, 2$. □

6 Model Extensions 
6.1 Extension to N flows

In Sec. 5 we studied the case of $N = 2$ TCP flows and analytically
characterized the stationary regime. For the case of $N > 2$ flows, one can
generalize the analytical results obtained in the two-flow scenario. More
precisely, under the SQF scheduling discipline, the resulting steady state
solution of (4)-(6) is a limit cycle composed of $N$ phases, each one denoted
as the $A_k^+$ phase (where flow $k$ is in service) and of duration
$2C/\alpha_k$, with $\alpha_k = 1/R_k^2$. The mean stationary throughput
associated to flow $k$ is:

$$\text{SQF}: \quad \overline{X}_k = \frac{1/\alpha_k}{\sum_{j=1}^{N} 1/\alpha_j} \, C \qquad (21)$$



Under the LQF scheduling discipline, one gets a generalization of (16), where
the mean stationary throughput associated to flow $k$ is:

$$\text{LQF}: \quad \overline{X}_k = \frac{\alpha_k}{\sum_{j=1}^{N} \alpha_j} \, C \qquad (22)$$

Similarly, under the FQ scheduling discipline, it can be easily verified that
there is still a fair allocation of the capacity $C$ among flows:

$$\text{FQ}: \quad \overline{X}_k = \frac{C}{N} \qquad (23)$$
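The SQF and FQ allocations can be illustrated with a short sketch. It assumes, as in the limit-cycle description above, that under SQF flow k is served at full capacity during a phase of duration 2C/alpha_k and not served otherwise, with alpha_k = 1/R_k^2. The function names and the RTT values are hypothetical.

```python
def sqf_throughputs(rtts, capacity):
    """Mean SQF throughputs obtained from the time shares of the limit-cycle phases."""
    # Phase k lasts 2C / alpha_k; with alpha_k = 1 / R_k**2 (assumed) this is 2C * R_k**2.
    durations = [2.0 * capacity * r * r for r in rtts]
    cycle = sum(durations)
    # Flow k gets the whole capacity during its own phase and nothing otherwise.
    return [capacity * d / cycle for d in durations]


def fq_throughputs(rtts, capacity):
    """Fair allocation: every flow gets C / N regardless of its RTT."""
    return [capacity / len(rtts) for _ in rtts]


# Hypothetical scenario: three flows with RTTs of 20, 50 and 100 ms on a 10 Mbps link.
print(sqf_throughputs([0.02, 0.05, 0.1], 10.0))
print(fq_throughputs([0.02, 0.05, 0.1], 10.0))
```

The SQF shares always sum to C and grow with the RTT, which is the long-term bias in favor of flows with large RTTs discussed in the paper.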

6.2 Extension to UDP traffic 

In this section we extend the analysis to the case of rate-uncontrolled sources
(UDP) competing with TCP traffic. We simply model a UDP source as a
CBR flow with constant rate $X^{UDP}$. Consider the mixed scenario with one
TCP source and one UDP source traversing a single bottleneck link of
capacity $C$. The system is described by a set of ODEs:



^ = a (l{g..o> + ^l{g.>o}) - ^A.^L.^; (24) 
dt dt 

with $A_1(t)$ the sending rate of the TCP flow, and $A_2(t) = A_2(0) = X^{UDP} < C$
the sending rate of the UDP flow. The definition of $L_k(t)$ is the same as in

The interesting metric to observe in the mixed TCP/UDP scenario is the loss
rate experienced by UDP in steady state, $L^{UDP} = \overline{L}_2$, which allows
one to evaluate the performance of streaming applications against those
carried by TCP.

Under the SQF scheduling discipline, independently of the initial condition,
the system reaches a first saturation point, $t_s$, where $Q(t_s) = B$. After
$t_s$, the flow with the longest queue suffers from packet losses. Depending on
the initial condition, it may be the UDP flow that loses first (if
$Q_2(t_s) > Q_1(t_s)$); in this case its virtual queue decreases until $t_1$,
where $Q_1(t_1) = Q_2(t_1) = B/2$, before entering service and finally
emptying. Indeed, for $t_s < t < t_1$,

$$Q_2(t) = Q_2(t_s) + \int_{t_s}^{t} \left[X^{UDP} - \left(A_1(u) + X^{UDP} - C\right)\right] du = B - Q_1(t),$$

and for $t > t_1$, $Q_2(t) = \frac{B}{2} + \int_{t_1}^{t} (X^{UDP} - C) \, du$. If
instead TCP is the first to lose at $t_s$, virtual queues both decrease until
$Q_2$ empties, i.e. for $t > t_s$,
$Q_2(t) = Q_2(t_s) + \int_{t_s}^{t} (X^{UDP} - C) \, du$. Once emptied, $Q_2$
remains equal to zero, since it is fed at rate $X^{UDP} < C$. The remaining
service capacity $C - X^{UDP}$ is allocated to the TCP flow, which behaves as a
TCP flow in isolation on a bottleneck link of reduced capacity, equal to
$C - X^{UDP}$. Thus, the loss rate of UDP is zero in steady state,

$$\text{SQF}: \quad L^{UDP} = 0. \qquad (25)$$

When SQF is replaced by LQF, the outcome is the opposite. In fact, once the
system reaches the first buffer saturation at $t_s$, one can easily observe
that the virtual queues tend to equalize, whether TCP or UDP is the flow
affected by packet losses. When $Q_1(t_1) = Q_2(t_1) = B/2$ at $t = t_1$, the
virtual queues do not change anymore, since
$\frac{dQ_{max}(t)}{dt} = A_1 - C + X^{UDP} - (A_1 + X^{UDP} - C) = 0$, $t > t_1$,
with $Q_{max}$ equal to $Q_1$ or $Q_2$ depending on the initial condition, and
the smaller one equal to $B - Q_{max}$. The first ODE in (24), related to
$A_1$, admits one stationary solution, that is,

$$\overline{A}_1 = \frac{C - X^{UDP}}{2} \left(1 + \sqrt{1 + \frac{8\alpha}{(C - X^{UDP})^2}}\right).$$

Hence, the loss rate of UDP results to be 

f"-^C ^ ,26, 
By replacing the expression of $\overline{A}_1$ in (26), we get



-UDP ^ X^^P 

L > C- 



and 



VUDP I C--y"°P fr,, I Sa ' \ 
^ -I- 2 \^ ^ \J {C-Xuopy J 

vUDP 



rrno V^DP 
7 — = X^"^P 



It follows that:

$$\text{LQF}: \quad L^{UDP} \approx X^{UDP} \qquad (27)$$

under the condition $2\alpha \ll C^2$, usually verified in practice. It implies
that the UDP loss rate is almost equal to its sending rate, and so the UDP
throughput is almost zero, exactly the opposite result w.r.t. SQF. Finally,
under FQ as scheduling discipline, it is easy to observe that both flows tend
to the fair rate, $\overline{X}_k = C/2$, $k = 1, 2$, and the loss rate of UDP
is, hence, given by the difference

$$\text{FQ}: \quad L^{UDP} = \left(X^{UDP} - \frac{C}{2}\right)^+ \qquad (28)$$
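The three steady-state results for the mixed TCP/UDP scenario can be collected in a small sketch. It uses the SQF result (zero loss), the LQF approximation (loss close to the UDP sending rate) and the FQ expression (excess over the fair rate). The TCP throughput is computed as the capacity left over by the UDP goodput, which assumes a fully utilized link; that helper and all names are illustrative assumptions.

```python
def udp_loss_rate(discipline, x_udp, capacity):
    """Steady-state UDP loss rate for one TCP flow and one CBR UDP flow."""
    if discipline == "SQF":
        return 0.0                      # UDP queue empties and is always served first
    if discipline == "LQF":
        return x_udp                    # approximation: UDP loses almost everything
    if discipline == "FQ":
        return max(x_udp - capacity / 2.0, 0.0)  # only the excess over the fair rate
    raise ValueError(f"unknown discipline: {discipline}")


def tcp_throughput(discipline, x_udp, capacity):
    """Capacity left to TCP, assuming the link is fully utilized (an assumption)."""
    udp_goodput = x_udp - udp_loss_rate(discipline, x_udp, capacity)
    return capacity - udp_goodput


# C = 10 Mbps, UDP stream below (3 Mbps) and above (7 Mbps) the fair rate C/2.
for x_udp in (3.0, 7.0):
    for discipline in ("FQ", "LQF", "SQF"):
        print(discipline, x_udp,
              udp_loss_rate(discipline, x_udp, 10.0),
              tcp_throughput(discipline, x_udp, 10.0))
```

For C = 10 Mbps the sketch gives a TCP throughput of 7 and 5 Mbps under FQ, 10 Mbps under LQF, and 7 and 3 Mbps under SQF for the two UDP rates.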



7 Numerical results 

This section presents numerical evaluations of the formulas obtained in Sec. 5
and Sec. 6 and comparisons with packet-level simulations. We focus on TCP
sending rates, TCP throughputs and UDP loss rates for a selected scenario
among a larger set. We consider a 10 Mbps bottleneck link for three TCP
flows with different RTTs. Tab. 2 reports the numerical values of the
comparison, showing a good agreement between model and ns2. In ns2 the link
is almost fully utilized, whereas the model predicts 100% utilization, which
mirrors the model condition $A(t) > C$ in steady state. The largest deviation
is registered for the TCP sending rate in presence of SQF, while the
throughput is correctly predicted. We recall that while the throughput formula
is exact, the sending rate formula is approximated. We also present, in
Tab. 4, a scenario on the performance of TCP and UDP sharing a common
bottleneck of 10 Mbps, where the stream rate of UDP, $X^{UDP}$, is taken first
smaller than the fair rate ($C/2$) and then larger. Results highlight that UDP
stalls under LQF, while it gets prioritized under FQ as long as the stream
rate is smaller than the fair rate. The possibility to implicitly
differentiate traffic through FQ has been exploited in [12, 13] for backbone
and border routers where the fair rate is supposed to be very large. SQF
appears to be much more effective at fulfilling this task, as shown in Tab. 4
(and through experimentation in [22]), since it gives priority to UDP flows
with any stream rate and allocates the rest of the bandwidth to TCP (see
Fig. 5, top, for the time evolution). Hence, SQF turns out to be a promising
per-flow scheduler to implement implicit service differentiation for UDP
flows with large stream rates.
          FQ [Mbps]         LQF [Mbps]        SQF [Mbps]
          ns2     Model     ns2     Model     ns2     Model
A1        5.26    5.00      7.48    7.35      5.8     6.32
A2        4.93    5.00      2.63    2.65      8.1     8.68
A         10.2    10.01     10.01   10.00     13.90   15.00
X1        5.12    5.00      7.31    7.35      2.90    2.65
X2        4.79    5.00      2.47    2.65      7.08    7.35
X         9.91    10.00     9.85    10.00     9.99    10.00

Table 3: Numerical comparison between ns2 and the model: C = 10 Mbit/sec,
R1 = 20ms, R2 = 50ms.





          FQ [Mbps]         LQF [Mbps]        SQF [Mbps]
          ns2     Model     ns2     Model     ns2     Model

UDP rate X^UDP = 3 Mbit/s < C/2
L^UDP     0.01              2.98    3         0.1
X^TCP     6.69    7         9.99    10        6.97    7

UDP rate X^UDP = 7 Mbit/s > C/2
L^UDP     1.96    2         6.95    7         0.1
X^TCP     4.98    5         9.87    10        2.98    3

Table 4: Scenario TCP/UDP: C = 10 Mbit/sec, R1 = 20ms.
Figure 5: Top plots: TCP and UDP under SQF scheduling. Bottom plot: 3 TCP
flows under SQF with RTTs = 2ms, 5ms, 10ms, B = 30kB.

Short-term fairness in SQF can also be observed in Fig. 5, bottom,
where three TCP flows share a common bottleneck of 10 Mbps. SQF allocates
resources to TCP as a time division multiplexer with timeslots of variable size
proportional to RTTs. This is the reason why UDP traffic is prioritized over
TCP.

8 Discussion and Conclusions 

The main contribution of this paper is given by the set of analytical results
gathered in Secs. 5 and 6, which allow one to capture and predict, in a
relatively simple framework, the system evolution, even in scenarios that
would be difficult to study through simulation or experiments. An additional,
non-marginal contribution of the analysis is that we can now give a solid
justification of many feasible applications of such per-flow schedulers in
today's networks.

FQ and LQF. The absence of a limit cycle for FQ and LQF in the stationary
regime explains the expected insensitivity of fairness w.r.t. the timescale. In
LQF, throughput is biased in favor of flows with small RTTs (the opposite
behavior is observed in SQF). FQ, on the contrary, provides unbiased
allocations. LQF clearly has applications in switching architectures to
maximize global throughput. However, throughput maximization in the switching
fabrics of high-speed routers does not take into account the nature of
Internet traffic, whose performance is closely related to TCP rate control. As
a consequence, flows that are bottlenecked elsewhere in the upstream path
stall, and UDP streams stall as well.

SQF. The analysis of SQF deserves particular attention. This per-flow
scheduler has been shown to have the attractive property of implicitly
differentiating streaming and data traffic, performing as a priority scheduler
for applications with low loss rate and delay constraints. Implicit service
differentiation is a great feature since it does not rely on the explicit
knowledge of a specific application and preserves network neutrality. Recent
literature on this subject has exploited FQ in order to achieve this goal
through the cross-protect concept ([12, 15]). Cross-protect gives priority to
flows with a rate below the fair rate and keeps the fair rate reasonably high
through flow admission control to limit the number of flows in progress. SQF
achieves the same goal of cross-protect with no need of admission control, as
it differentiates applications with rate much larger than the fair rate. FQ is
satisfactory on backbone links where the fair rate can be assumed to be
sufficiently high, whereas SQF has the ability to successfully replace it in
access networks. Furthermore, SQF has a very different behavior from the other
per-flow schedulers considered in this paper. The instantaneous rate admits a
limit cycle in the stationary regime, both for queues and rates, in which
flows are served in pseudo round-robin phases of duration proportional to the
link capacity and to RTTs, so that long-term throughput is biased in favor of
flows with large RTTs. Note that this oscillatory behavior, already observed
for TCP under drop tail queue management and due to flow synchronization, is
here due to the short-term unfairness of SQF: the flow associated to the
smallest queue is allocated all the resources without suffering from packet
drops caused by congestion. For instance, in LQF such phases do not exist
because the flow with the longest queue is at the same time in service and
penalized by congestion packet drops. Additional desired effects of SQF on TCP
traffic, not accounted for by the model, derive from the induced acceleration
of the slow start phase (when a flow generates little queuing), with the
consequent global effect that small data transfers get prioritized over larger
ones. As future work, we intend to investigate the effect of UDP burstiness,
not captured by fluid models, on the implicit differentiation capability of
SQF, via a more realistic stochastic model.
A Auxiliary result 1

Proposition A.1. Given $Q(t_s) = B$ and $A(t_s) > C$, $\forall t > t_s$ the
total queue remains full, i.e. $Q(t) = B$.

Proof. To prove the statement is equivalent to proving that $A(t) \geq C$,
$\forall t > t_s$. In fact, under the condition $A(t) \geq C$,
$\frac{dQ(t)}{dt} = A(t) - C - (A(t) - C) = 0$. Let us take eq. (6) and write
it for $A(t)$, $\forall t > t_s$:

dAit) ^ 1 Dk A{t) 

k = l " 

Since the derivative of $A(t)$ is lower-bounded by $(A(t) - C)^+$ and
becomes zero when $A(t) = C$, it is clear that starting from $A(t_s) > C$
the total rate will remain greater than or equal to $C$, $\forall t > t_s$,
which is enough to prevent $Q$, initially full at $t_s$, from decreasing. □



B Auxiliary result 2 

The following result proves the existence of the limit of the sequence
$\{\epsilon_i\}$, $\forall i \in \mathbb{N}$, and by symmetry of $\{\gamma_i\}$.

Proposition B.1. The positive sequences $\{\epsilon_i\}$, $\{\gamma_i\}$, with

1. $\{\epsilon_i\}$, $\forall i \in \mathbb{N}$, defined by the recursion

e^+i = ei2\//3/(ej + an,ji,T-), i > 1 

where 



C 2ae^\ , C 2/37,_i 

2. {7i}, Vi € N defined by the recursion 

7^+1 = 7i2\/a5(7i + /3t- , e^, n+i), i > 1 
are convergent and their limits are $\epsilon = 0$, $\gamma = 0$.

Proof. Take the sequence $\{\epsilon_i\}$. It holds:

Q+i < (ei + aTi)2v//3/(ei + aTi,0,T-) 

< i^r + 2C) 

< (e, + 2C) K < eiK'-^ + ^{2Cy-JK^ 



i-1 / \ j 



tV2c; 

where we defined if = 2V/3/ (^2^^ + 2Ce^^ ^/2^Erf{C/^/2(3) 
It follows that 

lime,if- + (2C)^gm^0. (30) 
Indeed, it results $K < 1 < 2C$, since the link capacity satisfies $C \gg 1$.
Since we assumed $\epsilon_i > 0$, $\forall i \geq 1$, it implies that
$\epsilon = \lim_{i \to \infty} \epsilon_i = 0$, that is, the sequence converges
to zero. By symmetry, $\gamma = \lim_{i \to \infty} \gamma_i = 0$, which
concludes the proof. □



C Proof of Corollary 5.3 



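Proof. We denote by $\overline{A}_1$ and $\overline{A}_2$ the mean values of the
sending rates in steady state, that is

$$\overline{A}_1 = \frac{1}{T} \int_T A_1(u) \, du, \qquad \overline{A}_2 = \frac{1}{T} \int_T A_2(u) \, du. \qquad (31)$$

Decompose the integral over the limit cycle $T$ into the sum of the integrals
over phases $A_1^+$ and $A_2^+$. It results: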

2C/a+2C//3 _ \ 

/(2C,0,u)du (32) 

2C/a J 
2C/Q+2C//3 \ 

/(2C, 0, u)du 

2C/a J 

2C(^^°S(^ + ^^^^^^^^ 

where we approximated $f$ with $\tilde{f}$. Similarly, one gets the approximate
expression for $\overline{A}_2$. It is worth observing that the above
approximation is justified by the numerical comparison of mean sending rates
against ns2. In addition, it is important to distinguish between TCP sending
rates $\overline{A}_1$, $\overline{A}_2$ (where the overline stands for the
stationary version) and throughputs $\overline{X}_1$, $\overline{X}_2$: during
phase $A_1^+$, $Q_1$ is served at capacity $C$, so $X_1 = C$ and $X_2 = 0$, as
all packets sent by flow 2 are lost. On the contrary, during phase $A_2^+$,
$X_2 = C$, whereas $X_1 = 0$. Since phase $A_1^+$ lasts $2C/\alpha$ and phase
$A_2^+$ lasts $2C/\beta$, it results that:

$$\overline{X}_1 = \frac{\beta}{\alpha + \beta} \, C, \qquad \overline{X}_2 = \frac{\alpha}{\alpha + \beta} \, C.$$

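In order to compute the value of the virtual queues in steady state, it
suffices to recall that $Q_1(t) = B - Q_2(t)$ and that over phase $A_k^+$,
$Q_k = \frac{B}{2} + \int (A_k(t) - C) \, dt = \frac{B}{2} + \int (\alpha_k t - C) \, dt$
(with $\alpha_1 = \alpha$, $\alpha_2 = \beta$). Hence,

$$\overline{Q}_1 = \frac{1}{T} \left( \int_{\text{phase } A_1^+} Q_1(t) \, dt + \int_{\text{phase } A_2^+} (B - Q_2(t)) \, dt \right) = \frac{B}{2} + \frac{C^2 (\alpha - \beta)}{3 \alpha \beta} \qquad (33)$$

where $T$ denotes the limit-cycle duration, i.e.
$T = 2C \left(\frac{1}{\alpha} + \frac{1}{\beta}\right)$. Similarly, one gets

$$\overline{Q}_2 = \frac{B}{2} - \frac{C^2 (\alpha - \beta)}{3 \alpha \beta}. \qquad (34)$$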
Under the assumption $\alpha > \beta$, it results that, as expected,
$\overline{Q}_1 > B/2$, while $\overline{Q}_2 < B/2$.

Note in addition that the approximation $f(\cdot) \approx \tilde{f}(\cdot)$ in
(32) has no impact on either the mean throughput or the virtual queue
formulas.

□ 
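The mean virtual queue values can be cross-checked numerically. The sketch below assumes the piecewise-quadratic trajectories described in the proof: during its own phase, of duration 2C/alpha, queue 1 follows B/2 + alpha t^2/2 - C t, and during the phase of flow 2, of duration 2C/beta, it equals B minus the analogous expression for queue 2. The closed form compared against is B/2 + C^2 (alpha - beta) / (3 alpha beta). Function names and parameter values are hypothetical.

```python
def mean_q1_closed_form(alpha, beta, buffer_size, capacity):
    """Closed-form mean of Q1 over the limit cycle."""
    return buffer_size / 2.0 + capacity ** 2 * (alpha - beta) / (3.0 * alpha * beta)


def mean_q1_numeric(alpha, beta, buffer_size, capacity, n=20_000):
    """Midpoint-rule average of Q1 over the two phases of the limit cycle."""
    t1 = 2.0 * capacity / alpha   # duration of the phase where flow 1 is served
    t2 = 2.0 * capacity / beta    # duration of the phase where flow 2 is served
    total = 0.0
    h1 = t1 / n
    for i in range(n):
        t = (i + 0.5) * h1
        total += (buffer_size / 2.0 + alpha * t * t / 2.0 - capacity * t) * h1
    h2 = t2 / n
    for i in range(n):
        t = (i + 0.5) * h2
        q2 = buffer_size / 2.0 + beta * t * t / 2.0 - capacity * t
        total += (buffer_size - q2) * h2   # Q1 = B - Q2 while the buffer is full
    return total / (t1 + t2)


# Hypothetical values: rates in packets/s, buffer in packets, alpha > beta.
print(mean_q1_closed_form(2500.0, 400.0, 3000.0, 1000.0))
print(mean_q1_numeric(2500.0, 400.0, 3000.0, 1000.0))
```

With alpha > beta both values lie above B/2, consistent with the remark that the flow with the smaller RTT holds the larger share of the buffer on average.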